by Oded Loewenstein
Hi and welcome to my blog!
Today I’m going to analyze happy moments and think about how businesses could use such an analysis to their advantage. The happy moments database (HappyDB) is a corpus of 100,000 happy moments collected from various countries across the globe (https://rit-public.github.io/HappyDB/).
Businesses nowadays strive to optimize their operations. Using the happy moments database, businesses can learn how the reasons behind what makes people happy differ across various segments. This will allow them to optimize their operations by offering each segment products or services that cater to its specific needs and desires.
To do so, we first need to define a “subject” for each happy moment, the “reason” that moment was associated with happiness. A very intuitive technique for such an analysis is the “tf-idf” technique, which measures the importance of a word to a specific text. Thus, the most important word would define the document best. This measure is constructed from 2 terms:
(1) Term frequency - the number of times a word occurs in a document
(2) Inverse document frequency - because some words are more common than others (for example, “the” is a very common word but it’s unlikely that it is the subject of a document), this measure decreases the weight of words that occur frequently across many documents.
By multiplying the two terms, we get the tf-idf function, which essentially provides the importance of a word to its text or, in other words, tells us the “subject” or the “reason” for the happy moment.
After constructing the tf-idf and learning the subject of every happy moment, let’s look at some of the conclusions that could be inferred from the results:
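To make the two terms concrete, here is a minimal sketch of how a “subject” could be picked for each moment. It is not the exact code behind the results in this post: the whitespace tokenizer, the log(N / df) form of the inverse document frequency, the function names, and the four toy moments are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase, split on whitespace, keep purely alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_subjects(moments):
    """Return the highest-scoring tf-idf word of each moment as its "subject"."""
    docs = [tokenize(m) for m in moments]
    n_docs = len(docs)

    # Document frequency: in how many moments does each word appear?
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    subjects = []
    for doc in docs:
        tf = Counter(doc)  # term frequency: raw count within this moment
        # tf-idf = tf * idf, with idf = log(N / df) (one common variant).
        scores = {w: tf[w] * math.log(n_docs / df[w]) for w in tf}
        subjects.append(max(scores, key=scores.get) if scores else None)
    return subjects


# Toy moments, invented for this example (not taken from HappyDB).
moments = [
    "i went to the temple with my family",
    "i went to the concert with my family",
    "i went to the mall with my family",
    "my daughter called and my daughter visited",
]
subjects = tfidf_subjects(moments)
print(subjects)
```

Words shared by most moments (“i”, “the”, “my”) get an idf close to zero, so the word that is specific to a moment wins. On a real corpus you would probably also want to remove stop words and handle ties between equally scored words.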
The 10 most common “subjects” of happy moments in USA
The 10 most common “subjects” of happy moments in India
Now, let’s step into Explora’s shoes for a second example: if people in India are primarily made happy by things related to the temple and shopping, Explora could mainly market trips to sacred and religious places with shopping possibilities. However, in the US Explora would mainly market trips with events or, as mentioned earlier, romantic trips.
By mapping the most frequent happy moments geographically, a company could optimize sales by understanding what makes most people happy in each country. A simple example of the most common “subject” in 5 selected countries:
The most common “subject” of happy moments in 5 selected countries
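The rankings above boil down to counting subjects within each segment. Below is a rough sketch of that counting step, assuming each happy moment has already been reduced to a (segment, subject) pair. The function name, the country labels, and the sample records are hypothetical and only illustrate the idea.

```python
from collections import Counter, defaultdict


def top_subjects_by_segment(records, k=10):
    """Count subjects per segment and return the k most common for each."""
    counts = defaultdict(Counter)
    for segment, subject in records:
        counts[segment][subject] += 1
    return {segment: c.most_common(k) for segment, c in counts.items()}


# Made-up (country, subject) pairs; real ones would come from the tf-idf step.
records = [
    ("USA", "event"), ("USA", "event"), ("USA", "wife"),
    ("IND", "temple"), ("IND", "temple"), ("IND", "shopping"),
]
top = top_subjects_by_segment(records, k=1)
print(top)
```

The same function works for any other segmentation: pass (gender, subject) pairs instead of (country, subject) pairs to compare women and men.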
Finally, further segmentations could be made to optimize the marketing in each area even further, like distinguishing between genders. Let’s take Mamazon, an online merchandise retailer, as an example. The graphs below illustrate the 10 most frequent happy moment subjects among women and men in the US.
Comparison between women’s and men’s most common “subject” of happy moments
We can easily observe that for women, events are much more associated with happiness than for men. We can also observe that while the spouse appears with a similar frequency for both genders, women’s other main sources of happiness are still family (daughter, grandchildren), while for men they are success and hobbies (money, work, guitar, fishing). Therefore, Mamazon could put a higher emphasis on advertising gifts for family members when the user is a woman, and gifts for the actual user when the user is a man.
Conclusion
To conclude, we witnessed how text processing of happy moments using the tf-idf function could be an integral tool for businesses to optimize their marketing and the services they offer. It should be noted that with some modifications (combining similar words or “reasons” mentioned before is just the tip of the iceberg), we could reach even better results.
Reference
Akari Asai, Sara Evensen, Behzad Golshan, Alon Halevy, Vivian Li, Andrei Lopatenko, Daniela Stepanov, Yoshihiko Suhara, Wang-Chiew Tan, Yinzhan Xu, “HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments”, LREC ’18, May 2018. (to appear)